UEval: A Benchmark for Unified Multimodal Generation

Bo Li, Yida Yin, Wenhao Chai, Xingyu Fu*, Zhuang Liu*

Princeton University

(* indicates co-advising)

What is UEval?

UEval comprises 1,000 expert-curated prompts, sourced from 8 diverse real-world domains, that require model outputs to contain both images and text.


Full-Leaderboard

view the full leaderboard ↗

view UEval problems

submit your results

Submit your results by opening an issue in our GitHub repository.

BibTeX

@article{li2026ueval,
    title   = {UEval: A Benchmark for Unified Multimodal Generation},
    author  = {Li, Bo and Yin, Yida and Chai, Wenhao and Fu, Xingyu and Liu, Zhuang},
    journal = {arXiv preprint arXiv:2601.22155},
    year    = {2026}
}

Website template modified from https://www.tbench.ai/.